Corporate Overview

Global provider of voice biometric solutions

Company name: Speech Technology Center, Ltd

Core expertise:

+ Voice identification and verification
+ Professional audio recording
+ Audio forensics
+ Noise cancellation

Locations:

+ Russia
+ Germany
+ Mexico
+ USA (office in 2009)

Year of foundation: 1990
Staff: 250, including 25 world-class PhDs
Global Customer Base in More than 60 Countries

Strong R&D Capabilities

Ambitious and experienced team:

+ One of the leading R&D teams in the voice sector worldwide: over 100 technical specialists, scientists and software developers (including 25 PhDs) and 5 certified audio forensic experts.
+ Strong management and sales teams
Why Speech?

Speech is a key communication tool in all fields of human activity.
Audio Forensics

Global leader in audio forensics
Over 15 years of experience

+ Forensic speaker identification.
+ Authenticity analysis of analog or digital audio recordings.
+ Audio equipment for forensic examination and identification.
+ Speech enhancement and audio restoration.
+ Text transcription of low-quality recordings.



Audio Forensics 
+ Automatic algorithms for real-time noise suppression and speech enhancement.

Sound Cleaner Premium: first and second prizes in the audio enhancement contest held by the AES (Audio Engineering Society), Denver, 2008

+ Efficient suppression of all types of noise and distortion
+ Adaptive filtering algorithms
+ Filters can be combined to process a recording simultaneously








Main Challenges 
State-of-the-art voice-ID systems face four basic challenges:

+ Ensuring robustness to noise (real-life audio)
+ Ensuring robust performance across different sound recording channels and levels of speaker stress
+ Effective processing of large-scale (nationwide) databases
+ Language- and context-independent identification



Speaker Identification Methods 
Spectral-formant method

+ The spectral-formant method (SFM) is based on the unique shape of each person’s vocal tract, which is reflected in the “visible speech” (spectrogram) of each speaker.

An example of the formant representation of the phrase “Forensic audio” pronounced by two different persons is shown in the picture. (The horizontal axis is time in seconds. The vertical axis is frequency in hertz. Energy level is depicted by the darkness of the trace.)





Speaker Identification Methods 
Pitch statistics method

+ The pitch statistics method (PSM) uses 16 different pitch parameters, including the average pitch value, maximum, minimum, median, percentage of areas with rising pitch, pitch logarithm variation, pitch logarithm asymmetry, pitch logarithm excess and 8 more parameters.
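The eight parameters named above can be illustrated with a minimal Python sketch. It is not the PSM implementation: the function name `pitch_statistics`, the input format (a list of pitch values in Hz for voiced frames) and the frame-to-frame definition of "rising pitch" are assumptions made for illustration, and "variation", "asymmetry" and "excess" are read here as the variance, skewness and excess kurtosis of the log-pitch values.

```python
import math
import statistics


def pitch_statistics(f0_hz):
    """Summarise a pitch track (Hz, voiced frames only) with eight statistics.

    Hypothetical helper: the parameter definitions are assumptions,
    not the definitions used by the PSM described in the slides.
    """
    if len(f0_hz) < 2:
        raise ValueError("need at least two voiced frames")

    # Central moments of the log-pitch distribution (population form).
    logs = [math.log(f) for f in f0_hz]
    mu = statistics.fmean(logs)
    n = len(logs)
    m2 = sum((x - mu) ** 2 for x in logs) / n
    m3 = sum((x - mu) ** 3 for x in logs) / n
    m4 = sum((x - mu) ** 4 for x in logs) / n

    # Count frame-to-frame transitions where the pitch goes up.
    rising = sum(1 for a, b in zip(f0_hz, f0_hz[1:]) if b > a)

    return {
        "mean": statistics.fmean(f0_hz),
        "max": max(f0_hz),
        "min": min(f0_hz),
        "median": statistics.median(f0_hz),
        "percent_rising": 100.0 * rising / (n - 1),
        "log_variance": m2,
        "log_skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "log_excess_kurtosis": m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0,
    }
```

Two recordings could then be compared by the distance between their parameter vectors; how the PSM actually compares them is not stated in the slides.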




An example of automated pitch extraction in the phrase “Forensic audio” pronounced by two different persons is shown in the picture.




Speaker Identification Methods 
GMM/SVM method

+ In the GMM/SVM approach, Gaussian mixtures are used to approximate the statistical distributions of MFCC (mel-frequency cepstral coefficient) parameters extracted from the speech of different speakers.
+ Support vector machines are robust classifiers in multi-dimensional space.
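For reference, the standard Gaussian mixture density used to model a feature vector can be written as below. The symbols (x for an MFCC vector, K for the number of components, w_k, mu_k and Sigma_k for the component weights, means and covariances) are notational assumptions and do not come from the slides.

```latex
p(\mathbf{x} \mid \lambda) = \sum_{k=1}^{K} w_k \,
  \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k\right),
\qquad \sum_{k=1}^{K} w_k = 1, \quad w_k \ge 0
```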



Peculiarities of Different Methods 







Dependence on speech signal characteristics 




Fusion Solution 
+ Ability to work with signals from various communication channels: both microphone and telephone (landline, GSM)
+ Robustness to noise: low-quality signal processing (SNR down to 10 dB)
+ Processing of short speech signals: speaker identification from a few seconds of speech
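To make the "SNR down to 10 dB" figure concrete, the sketch below computes a signal-to-noise ratio in decibels from the average power of two sample sequences. It assumes the clean signal and the noise are available separately, which is rarely true for real recordings; the function name `snr_db` is hypothetical.

```python
import math


def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from average sample power.

    Illustrative only: assumes separate signal and noise sample lists.
    """
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10.0 * math.log10(p_signal / p_noise)
```

An SNR of 10 dB means the speech carries ten times the power of the noise.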



Performance of Different Methods 



Database: NIST SRE 2004

Method                    EER
Spectral-formant method   13%
Pitch statistics          15.9%
GMM/SVM                   7.5%
Fusion                    4.7%
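The EER (equal error rate) is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below estimates it from two lists of comparison scores by sweeping the threshold over the observed values. It is not the evaluation procedure used to obtain the figures above; the function name and the convention that a higher score means "same speaker" are assumptions.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER from same-speaker and different-speaker scores.

    Assumes higher scores indicate the same speaker. Returns the mean of
    FAR and FRR at the threshold where the two rates are closest.
    """
    best_gap, best_eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        # False rejection: a genuine trial scoring below the threshold.
        frr = sum(1 for s in genuine if s < threshold) / len(genuine)
        # False acceptance: an impostor trial scoring at or above it.
        far = sum(1 for s in impostor if s >= threshold) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

With only a handful of scores the estimate is coarse; it becomes smoother as the number of trials grows.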
Adaptation

Customization - ability to adapt the system to the key search parameters

Speech Database

Adaptation of parameters - taking

Identification results



Voice Identification for Experts 







+ TrawlLab facilitates voice-ID analysis in multi-target forensic investigations by eliminating impostors and ranking the top-of-the-list speakers by likelihood.
VoiceNet.ID is designed for:

Reliable identification against a nationwide voice database of speakers.

VoiceNet.ID highlights

+ Storage and real-time processing of large volumes of voiceprints
+ Client-server architecture
+ Web client
+ Centralized repository of speaker profiles
+ Multi-user system
+ Secure storage and access
+ Remote access to the database
+ Storage of additional information (video, photo, text)
Architecture

Record LAN operator

Calculation cluster

Speaker’s profile card (SPC)

Automatically extracts biometric traits of voice and speech from the attached sound recordings. A speaker card can contain a wealth of additional information about the person (text, photo, video, etc.).

Database management

SPCs in the database can be organized into an unlimited number of sections and sub-sections to facilitate further search.

Identification results

The results of a “VoiceNet.ID” search are presented in the form of a list indicating the likelihood ratio (LR) of each record containing the speech of a target speaker.
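A result list of this kind can be sketched as records sorted by their LR value. The `Match` class, the `rank_matches` function and the `min_lr` cut-off below are hypothetical names introduced for illustration; they are not part of VoiceNet.ID.

```python
from dataclasses import dataclass


@dataclass
class Match:
    record_id: str  # identifier of a database record (hypothetical field)
    lr: float       # likelihood value reported for that record


def rank_matches(matches, min_lr=1.0):
    """Drop records below the assumed cut-off and sort the rest, best first."""
    kept = [m for m in matches if m.lr >= min_lr]
    return sorted(kept, key=lambda m: m.lr, reverse=True)
```

For example, `rank_matches([Match("a", 0.5), Match("b", 12.0), Match("c", 3.0)])` keeps records "b" and "c", in that order.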

[Screenshot of the task list. Columns: Status, Creation Date, Execution Startup Date, Duration. Example row: Completed, 01.05.2009, 01.05.2009, 0:00:26.]

Technical specs:

+ DBMS - Oracle 11g, PostgreSQL; can be adapted for others
+ OS - UNIX (Solaris 10, Linux), Windows Server 2003 or later
+ Web-service-based architecture
+ Application server - GlassFish V3, Tomcat 6; can be adapted for others
+ Cluster calculations - JPPF 1.8

Performance & scalability:

+ Size - The database is scalable up to 10,000,000 cards
+ Speed - Performance is directly linked to the computing power of the server (parallel calculation support)
+ Tasks - The system can be adapted to any voice-ID challenge (searching for unknown speakers in the database or for known speakers in a stream of audio files)

Thank you for your attention!

WWW.SPEECHPRO.COM

tel.: +7 812 331-0665 
fax: +7 812 327-9297 



